Spread of hate speech in online social media
Online social media platforms today are afflicted with several issues, with
hate speech at the forefront. The prevalence of online hate speech has fueled
horrific real-world hate crimes such as the genocide of Rohingya Muslims,
communal violence in Colombo and the recent massacre in
the Pittsburgh synagogue. Consequently, it is imperative to understand the
diffusion of such hateful content in an online setting. We conduct the first
study that analyses the flow and dynamics of posts generated by hateful and
non-hateful users on Gab (gab.com) over a massive dataset of 341K users and 21M
posts. Our observations confirm that posts by hateful users diffuse farther,
wider, and faster, and have greater outreach, than those of non-hateful users. A
deeper inspection into the profiles and network of hateful and non-hateful
users reveals that the former are more influential, popular and cohesive. Thus,
our research explores the interesting facets of diffusion dynamics of hateful
users and broadens our understanding of hate speech in the online world.
Comment: 8 pages, 5 figures, and 4 tabl
Linguistic representations for fewer-shot relation extraction across domains
Recent work has demonstrated the positive impact of incorporating linguistic
representations as additional context and scaffolding on the in-domain
performance of several NLP tasks. We extend this work by exploring the impact
of linguistic representations on cross-domain performance in a few-shot
transfer setting. An important question is whether linguistic representations
enhance generalizability by providing features that function as cross-domain
pivots. We focus on the task of relation extraction on three datasets of
procedural text in two domains, cooking and materials science. Our approach
augments a popular transformer-based architecture by alternately incorporating
syntactic and semantic graphs constructed by freely available off-the-shelf
tools. We examine their utility for enhancing generalization, and investigate
whether earlier findings, e.g. that semantic representations can be more
helpful than syntactic ones, extend to relation extraction in multiple domains.
We find that while the inclusion of these graphs results in significantly
higher performance in few-shot transfer, both types of graph exhibit roughly
equivalent utility.
Comment: ACL 202
Improving Broad-Coverage Medical Entity Linking with Semantic Type Prediction and Large-Scale Datasets
Medical entity linking is the task of identifying and standardizing medical
concepts referred to in unstructured text. Most of the existing methods
adopt a three-step approach of (1) detecting mentions, (2) generating a list of
candidate concepts, and finally (3) picking the best concept among them. In
this paper, we investigate how to alleviate the overgeneration of
candidate concepts in the candidate generation module, the most under-studied
component of medical entity linking. For this, we present MedType, a fully
modular system that prunes out irrelevant candidate concepts based on the
predicted semantic type of an entity mention. We incorporate MedType into five
off-the-shelf toolkits for medical entity linking and demonstrate that it
consistently improves entity linking performance across several benchmark
datasets. To address the dearth of annotated training data for medical entity
linking, we present WikiMed and PubMedDS, two large-scale medical entity
linking datasets, and demonstrate that pre-training MedType on these datasets
further improves entity linking performance. We make our source code and
datasets publicly available for medical entity linking research.
Comment: 35 page
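
The pruning step described in the abstract above (dropping candidate concepts whose semantic type does not match the type predicted for the mention) can be illustrated with a minimal sketch. This is not the MedType implementation: the `Candidate` class, the `prune_by_type` and `pick_best` functions, the type labels, the scores, and the fallback to the unpruned list are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    """One candidate concept for a mention (hypothetical structure)."""
    concept_id: str      # placeholder identifier, not a real concept ID
    semantic_type: str   # placeholder coarse type label
    score: float         # score assigned by some upstream candidate generator


def prune_by_type(candidates, predicted_types):
    """Keep only candidates whose semantic type is among the predicted types.

    Assumption: if pruning would remove every candidate, fall back to the
    original list so the linker still has something to choose from.
    """
    kept = [c for c in candidates if c.semantic_type in predicted_types]
    return kept if kept else list(candidates)


def pick_best(candidates):
    """Pick the highest-scoring candidate, or None if there are none."""
    return max(candidates, key=lambda c: c.score) if candidates else None


# Toy example: the generator over-generates and ranks a wrong-type concept first.
candidates = [
    Candidate("C1", "Disease", 0.4),
    Candidate("C2", "Drug", 0.9),
    Candidate("C3", "Disease", 0.7),
]
pruned = prune_by_type(candidates, {"Disease"})
best_without_pruning = pick_best(candidates)
best_with_pruning = pick_best(pruned)
```

In this toy case the unpruned linker would select the wrong-type concept `C2`, while pruning on the predicted type leaves `C1` and `C3` and the linker selects `C3`.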
NARMADA: Need and Available Resource Managing Assistant for Disasters and Adversities
Although a lot of research has been done on utilising Online Social Media
during disasters, there exists no system for a specific task that is critical
in a post-disaster scenario -- identifying resource-needs and
resource-availabilities in the disaster-affected region, coupled with their
subsequent matching. To this end, we present NARMADA, a semi-automated platform
that leverages crowd-sourced information from social media posts to
assist post-disaster relief coordination efforts. The system employs Natural
Language Processing and Information Retrieval techniques for identifying
resource-needs and resource-availabilities from microblogs, extracting
resources from the posts, and also matching the needs to suitable
availabilities. The system is thus capable of facilitating the judicious
management of resources during post-disaster relief operations.
Comment: ACL 2020 Workshop on Natural Language Processing for Social Media
(SocialNLP
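
The matching of needs to availabilities mentioned in the NARMADA abstract above can be sketched as follows, assuming that each post has already been reduced to a set of extracted resource names. The abstract does not specify the matching method, so the overlap-count ranking, the `match_needs` function, and the post identifiers below are hypothetical.

```python
def match_needs(needs, availabilities):
    """Rank availability posts for each need post by shared resources.

    needs, availabilities: dicts mapping a post id to a set of resource names.
    Returns a dict mapping each need id to a list of availability ids, ordered
    by descending overlap size (ties broken by id). Availability posts that
    share no resource with the need are left out.
    """
    matches = {}
    for need_id, need_resources in needs.items():
        scored = [
            (len(need_resources & avail_resources), avail_id)
            for avail_id, avail_resources in availabilities.items()
            if need_resources & avail_resources
        ]
        scored.sort(key=lambda pair: (-pair[0], pair[1]))
        matches[need_id] = [avail_id for _, avail_id in scored]
    return matches


# Toy example with made-up posts.
needs = {"n1": {"water", "blankets"}, "n2": {"medicine"}}
availabilities = {
    "a1": {"water"},
    "a2": {"water", "blankets", "food"},
    "a3": {"tents"},
}
result = match_needs(needs, availabilities)
```

Here `n1` is matched first to `a2` (two shared resources) and then to `a1` (one), while `n2` has no suitable availability. A semi-automated platform would presumably show such ranked suggestions to a human coordinator rather than act on them directly.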